When reading news articles on social networking services and news sites, readers can view comments posted by other people on these articles. By reading these comments, a reader can understand public opinion about the news, which often helps in grasping the overall picture. However, these comments often contain offensive language that readers would prefer not to read. This study aims to predict such offensive comments in order to improve the reader's experience while reading comments. To account for the diversity of readers' values, the proposed method predicts offensive news comments for each reader based on feedback from a small number of news comments that the reader rated as "offensive" in the past. In addition, we use a machine learning model that considers the characteristics of the commenters, so that predictions are independent of the words and topics in the news comments. The experimental results show that prediction can be personalized even when the amount of reader feedback used in the prediction is limited. In particular, the proposed method, which considers the commenters' characteristics, has a low probability of falsely detecting comments as offensive.
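The recursive and hierarchical structure of full rooted trees is well suited to representing statistical models in various fields, such as data compression, image processing, and machine learning. In most of these cases, the full rooted tree is not treated as a random variable; as a result, model selection that avoids overfitting becomes problematic. One way to solve this problem is to assume a prior distribution on full rooted trees. This makes it possible to avoid overfitting on the basis of Bayes decision theory. For example, by assigning a low prior probability to complex models, the maximum a posteriori estimator prevents overfitting. Overfitting can also be avoided by averaging all models weighted by their posteriors. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is suitable for computing properties of the distribution, such as the mode, expectation, and posterior distribution, using recursive functions. Although such distributions have been proposed in previous studies, they are applicable only to specific applications. We therefore extract their mathematically essential components and derive new generalized methods for computing the expectation, posterior distribution, and so on.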
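As a rough illustration of how a distribution on full rooted trees can be handled with recursive functions, the sketch below assumes a simple construction that the abstract does not itself specify: every node above a maximum depth becomes an inner node with `k` children with probability `g`, and a leaf otherwise, while nodes at the maximum depth are always leaves. The names `enumerate_trees`, `tree_probability`, `expected_leaves`, and the parameters `k`, `g`, and `max_depth` are hypothetical and chosen only for this example.

```python
from itertools import product


def enumerate_trees(k, max_depth, depth=0):
    """Yield every full k-ary tree of depth <= max_depth.

    A leaf is the empty tuple (); an inner node is a tuple of k subtrees.
    """
    yield ()
    if depth < max_depth:
        subtrees = list(enumerate_trees(k, max_depth, depth + 1))
        for children in product(subtrees, repeat=k):
            yield children


def tree_probability(tree, g, max_depth, depth=0):
    """Probability of `tree` under the assumed node-splitting prior."""
    if depth == max_depth:
        return 1.0  # nodes at the maximum depth are leaves with certainty
    if tree == ():
        return 1.0 - g  # this node chose to be a leaf
    p = g  # this node chose to split into k children
    for child in tree:
        p *= tree_probability(child, g, max_depth, depth + 1)
    return p


def count_leaves(tree):
    """Number of leaves of a tree in the nested-tuple representation."""
    if tree == ():
        return 1
    return sum(count_leaves(child) for child in tree)


def expected_leaves(k, g, max_depth, depth=0):
    """Expected number of leaves, computed recursively without enumeration."""
    if depth == max_depth:
        return 1.0
    return (1.0 - g) + g * k * expected_leaves(k, g, max_depth, depth + 1)


if __name__ == "__main__":
    k, g, max_depth = 2, 0.5, 2
    trees = list(enumerate_trees(k, max_depth))
    total = sum(tree_probability(t, g, max_depth) for t in trees)
    brute = sum(tree_probability(t, g, max_depth) * count_leaves(t) for t in trees)
    print(len(trees), total, brute, expected_leaves(k, g, max_depth))
```

The brute-force sum over all trees is only feasible for tiny depths, since the number of trees grows very quickly; the recursive function gives the same expectation in time linear in the depth, which is the kind of benefit a recursive parametrization is meant to provide under this assumed construction.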